Evaluation of short-time speech-based intelligibility metrics

Authors

Karen L. Payton

Mona Shrestha

Abstract

The Speech Transmission Index (STI) is based on acoustic measurements of an environment and has been shown to correlate with speech intelligibility under a wide range of acoustic conditions (Houtgast & Steeneken 1984). It is a weighted average of metrics derived from envelope signals in multiple frequency bands spanning the speech spectrum. A variety of methods have been proposed to compute the STI (Houtgast & Steeneken 1971; Steeneken & Houtgast 1980; Ludvigsen 1987; Drullman et al. 1994a, b; Payton et al. 1994; Drullman 1995; IEC 1998; Payton & Braida 1999; Payton et al. 2002; Goldsworthy & Greenberg 2004). Some of these methods use speech as the test stimulus rather than the artificially modulated noise originally proposed by Houtgast and Steeneken (1985). Many of the speech-based techniques have been shown to give the same result (Ludvigsen et al. 1990; Payton et al. 2002) as both the traditional STI, which is based on modulation reductions in intensity-modulated noise, and a theoretically derived STI, which is obtained from weighted signal-to-noise ratios (SNRs) in seven octave bands and the room reverberation time (RT) (Houtgast & Steeneken 1985). To date, all speech-based approaches have used speech materials lasting at least a minute or two to generate metrics correlated with long-term speech intelligibility. Consequently, they have not been used to predict short-time changes in intelligibility caused by time-varying environments such as fluctuating background noise. The current work investigates the ability of two speech-based methods to track short-term STI results by computing them from speech segments of various lengths in environments with stationary speech-shaped noise, speech-shaped noise plus reverberation, or multi-talker babble. The two methods evaluated are the Envelope Regression (ER) and Normalized Correlation (NC) methods. The ER method is based on the speech-based STI method proposed by Ludvigsen et al. (1990). The NC method was proposed by Goldsworthy and Greenberg (2004), who also analyzed the long-term characteristics of both metrics.
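
The theoretically derived STI mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and the abstract does not give the details. The sketch assumes the commonly cited formulation: a modulation transfer value m(F) for stationary noise plus exponential reverberation decay, 14 modulation frequencies from 0.63 to 12.5 Hz, and an apparent SNR clipped to ±15 dB. The function names are hypothetical, and the uniform octave-band weights are a placeholder assumption, not the published weights.

```python
import math

# ASSUMPTION: 14 one-third-octave modulation frequencies (Hz), as commonly
# used in STI formulations; the abstract does not list them.
MOD_FREQS_HZ = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5,
                3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]


def modulation_transfer(f_mod_hz, snr_db, rt_s):
    """Modulation transfer value m(F) for noise plus exponential reverberation."""
    reverb = 1.0 / math.sqrt(1.0 + (2.0 * math.pi * f_mod_hz * rt_s / 13.8) ** 2)
    noise = 1.0 / (1.0 + 10.0 ** (-snr_db / 10.0))
    return reverb * noise


def transmission_index(m):
    """Map m to [0, 1] through an apparent SNR clipped to +/-15 dB."""
    m = min(max(m, 1e-12), 1.0 - 1e-12)  # keep the logarithm finite
    apparent_snr_db = 10.0 * math.log10(m / (1.0 - m))
    apparent_snr_db = max(-15.0, min(15.0, apparent_snr_db))
    return (apparent_snr_db + 15.0) / 30.0


def theoretical_sti(band_snrs_db, rt_s, band_weights=None):
    """Weighted average of per-band transmission indices.

    band_snrs_db: one SNR (dB) per octave band (seven in the abstract).
    rt_s: room reverberation time in seconds.
    band_weights: ASSUMPTION, defaults to uniform placeholder weights.
    """
    n_bands = len(band_snrs_db)
    if band_weights is None:
        band_weights = [1.0 / n_bands] * n_bands
    if len(band_weights) != n_bands:
        raise ValueError("need one weight per band")
    total_weight = sum(band_weights)
    sti = 0.0
    for snr_db, weight in zip(band_snrs_db, band_weights):
        indices = [transmission_index(modulation_transfer(f, snr_db, rt_s))
                   for f in MOD_FREQS_HZ]
        sti += (weight / total_weight) * sum(indices) / len(indices)
    return sti
```

With 0 dB SNR in all seven bands and no reverberation, m(F) is 0.5 at every modulation frequency, so the sketch returns an STI of 0.5. Raising `rt_s` above zero lowers m(F) most at the higher modulation frequencies, which lowers the result.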

Similar resources

End-to-End Waveform Utterance Enhancement for Direct Evaluation Metrics Optimization by Fully Convolutional Neural Networks

Speech enhancement model is used to map a noisy speech to a clean speech. In the training stage, an objective function is often adopted to optimize the model parameters. However, in most studies, there is an inconsistency between the model optimization criterion and the evaluation criterion on the enhanced speech. For example, in measuring speech intelligibility, most of the evaluation metric i...

Project: Evaluation of Speech Intelligibility in Wireless Communication via Speech Recognition Metrics

Outcome measures based on classification performance fail to predict the intelligibility of binary-masked speech.

To date, the most commonly used outcome measure for assessing ideal binary mask estimation algorithms is based on the difference between the hit rate and the false alarm rate (H-FA). Recently, the error distribution has been shown to substantially affect intelligibility. However, H-FA treats each mask unit independently and does not take into account how errors are distributed. Alternatively, a...

On speech intelligibility estimation of phase-aware single-channel speech enhancement

To reduce time and costs in the development process of noise reduction algorithms, an objective intelligibility measure is crucial. Such a measure has to show high correlation with speech intelligibility determined by real listening experiments. In the past, several measures were found that perform reliably in a particular scenario when only the spectral amplitude of a noisy signal is modified. ...

Speech Intelligibility of Cochlear-Implanted and Normal-Hearing Children

Introduction: Speech intelligibility, the ability to be understood verbally by listeners, is the gold standard for assessing the effectiveness of cochlear implantation. Thus, the goal of this study was to compare the speech intelligibility between normal-hearing and cochlear-implanted children using the Persian intelligibility test. Materials and Methods: Twenty-six cochlear-implanted childre...

Publication date: 2008
